Three novel bird strike likelihood modelling techniques: The case of Brisbane Airport, Australia

The risk posed by wildlife to air transportation is of great concern worldwide. In Australia alone, 17,336 bird-strike incidents and 401 animal-strike incidents were reported to the Australian Transport Safety Bureau (ATSB) in the period 2010-2019. Moreover, when collisions do occur, the impact can be catastrophic (loss of life, loss of aircraft) and involve significant cost to the affected airline and airport operator (estimated globally at US$1.2 billion per year). On the other side of the coin, civil aviation and airport operations have significantly affected bird populations. The number of bird strikes reported worldwide, which are generally fatal to the individual birds involved, has been increasing: the annual average of 12,219 reported strikes between 2008 and 2015 is nearly double the annual average of 6,702 strikes reported between 2001 and 2007 (ICAO, 2018). Airport operations, including construction of airport infrastructure, frequent take-offs and landings, airport noise and lights, and wildlife hazard management practices aimed at reducing the risk of bird strike (e.g., spraying to remove weeds and invertebrates, drainage, and even direct killing of individual hazard species), may result in habitat fragmentation, population decline, and rare bird extinction adjacent to airports (Kelly T, 2006; Zhao B, 2019; Steele WK, 2021). Nevertheless, there remains an imperative to continually improve wildlife hazard management methods and strategies so as to reduce the risk to aircraft and to bird populations. Current approved wildlife risk assessment techniques in Australia are limited to ranking of identified hazard species, i.e., they are ‘static’ and, as such, do not provide a day-to-day risk/collision likelihood. The purpose of this study is to move towards a dynamic, evidence-based risk assessment model of wildlife hazards at airports.
Ideally, such a model should be sufficiently sensitive and responsive to changing environmental conditions to be able to inform both short- and longer-term risk mitigation decisions. Challenges include the identification and quantification of contributory risk factors, and the selection and configuration of modelling technique(s) that meet the aforementioned requirements. In this article we focus on likelihood of bird strike and introduce three distinct, but complementary, assessment techniques, i.e., Algebraic, Bayesian, and Clustering (ABC), for measuring the likelihood of bird strike in the face of constantly changing environmental conditions. The ABC techniques are evaluated using environment and wildlife observations routinely collected by the Brisbane Airport Corporation (BAC) wildlife hazard management team. Results indicate that each of the techniques meets the requirement of providing dynamic, realistic collision risks in the face of changing environmental conditions.

adherence to PLOS ONE policies on sharing data and materials.

E.2.b.2
Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials." (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competinginterests). If this adherence statement is not accurate and there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.
See E.2.b.1 for revised Competing Interests statement.

E.2.b.3
Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

Statements have been included in the revised Cover Letter as requested.

E.3
Please upload a new copy of Figure 7 as the detail is not clear.

E.4
Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly.
The Supporting Information section has been updated to include more explanatory captions.

The following text blocks have been moved from Results to Methods sections:

The statistical analysis around the ABC approach can be improved by including objective examination of whether some 'days' have higher collision risk (see suggestion below in point 7).
We thank the reviewer for this suggestion. We have adopted the approach suggested by the reviewer in point 7 and applied a GLM to associate characteristics of days with collision likelihood.

R1.5.3
Theoretical concepts behind the three methods (Algebraic modelling, Bayesian networks and K-means cluster analysis) are explained well, but are sometimes superfluous as these techniques are widely used, and these sections can be reduced by referring to relevant literature wherever applicable, to keep the text tight, to the point, and focused on the application of the technique.
Many of the underpinning theoretical concepts have been reduced or removed in line with the reviewer's suggestion.

R1.5.4
More details on how these techniques have been applied such as variables and analytical treatments need to be mentioned explicitly, so that the approach can be clearly understood and replicated.
We have updated the Methods section to better indicate the variables and analytical treatments, including software and libraries, that were used in the study.
R1.5.6 Some analysis that these techniques depend on such as PCA, ROC, and validation techniques for Cluster analysis are not mentioned in Methods and appear abruptly in Results, making it difficult to follow how the methods were applied.
We have carefully reviewed the manuscript to address this issue. Sections of text in the Results section that introduce a method or technique (e.g., ROC, PCA, internal cluster validation such as silhouette maps) have now been moved to Methods, with only actual results reported in the Results section.
R1.6 L 234 -320 Algebraic modelling approach has been explained in sufficient detail and clarity to allow replication. However, the content is verbose and can be revised for brevity. For instance, repetition of the mathematical notation in L274 -276 is redundant, and L283-301 explanations of Gaussian and Sigmoid functions can be reduced.
We have removed the repeated mathematical notation in L274-L276 and reduced the text describing Gaussian and sigmoid functions.

R1.7 L 304-309 The choice of expected abundance thresholds in classifying collision likelihoods as 'low', 'medium', 'high' is subjective and arbitrary. No statistical treatment has been presented that relates expected abundance to air strike probability. An objective way would be to predict collision probability using binomial GLM (logistic regression) by modelling air strike (1/0) on daily expected abundance of hazard species (eh), and to subsequently classify abundance thresholds based on stipulated cut-offs of predicted collision probability.
We do thank the reviewer for this suggestion. We did use Binomial GLM to model strike probability on daily abundance and seasonality values. From this modelling we were able to determine cut-offs. As the modelling was binary (0/1), we changed the formulation of the collision likelihood function so that the function returns either 'Low' or 'Elevated' as values for collision likelihood.
Details of the GLM have been added to the Methods section.
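For readers of this response, the idea of deriving an abundance cut-off from a binomial GLM can be sketched as follows. This is only an illustration under stated assumptions: it uses a single predictor (daily abundance) rather than the abundance and seasonality values used in the manuscript, the data are invented, the 0.5 probability cut-off is arbitrary, and the function names `fit_logistic`, `abundance_cutoff` and `collision_likelihood` are hypothetical rather than taken from the study's code.

```python
import math

def fit_logistic(x, y, iters=25):
    """Fit P(strike) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient w.r.t. intercept
            g1 += (yi - p) * xi       # gradient w.r.t. slope
            h00 += w                  # entries of X^T W X
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def abundance_cutoff(b0, b1, p_cut):
    """Abundance at which the predicted strike probability equals p_cut."""
    return (math.log(p_cut / (1.0 - p_cut)) - b0) / b1

def collision_likelihood(abundance, cutoff):
    """Binary likelihood label; assumes probability rises with abundance."""
    return "Elevated" if abundance >= cutoff else "Low"

# Invented daily records: abundance of a hazard species and strike (1/0).
abundance = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
strike = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(abundance, strike)
cutoff = abundance_cutoff(b0, b1, p_cut=0.5)  # illustrative cut-off only
print(collision_likelihood(2, cutoff), collision_likelihood(8, cutoff))
```

In practice a statistics library would normally be used for the fit, and the probability cut-off would be chosen to suit the operational context.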
R1.8 L 310 Bayesian networks have been used to examine causal relationships between factors leading to collisions. Which factors / variables have been examined in this study are not mentioned in Methods.
The variables used have been included in the Methods section (in Table 2).
R1.9 L 316 In Bayesian networks, a node (factor) has a conditional probability distribution
There are 30 such conditional probability tables (one for each node of each Bayesian Network Model). To make the tables available to reviewers, and ultimately readers, we have created a GitHub repository which we reference in the MS: https://github.com/robertandrews59/ABCBirdstrike
NB. As we are proposing to release additional Brisbane Airport Corporation related data, we have approached BAC Risk and Governance to grant approval for the release. As approval has not yet been granted, the GitHub repository is set as 'private'.
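As a reminder of what a conditional probability table holds, the sketch below estimates one from daily records by simple counting. It is a minimal illustration only: the node names (`rain`, `abundance`), their states and the records are hypothetical, not drawn from the BAC data or the repository, and trained Bayesian network tools typically apply smoothing or other learning schemes not shown here.

```python
from collections import Counter, defaultdict

def estimate_cpt(records, child, parents):
    """Return {parent_states: {child_state: P(child | parents)}} from counts."""
    counts = defaultdict(Counter)
    for rec in records:
        key = tuple(rec[p] for p in parents)
        counts[key][rec[child]] += 1
    return {
        key: {state: n / sum(c.values()) for state, n in c.items()}
        for key, c in counts.items()
    }

# Hypothetical daily observations for a two-node fragment: rain -> abundance.
records = [
    {"rain": "yes", "abundance": "high"},
    {"rain": "yes", "abundance": "high"},
    {"rain": "yes", "abundance": "low"},
    {"rain": "no", "abundance": "low"},
    {"rain": "no", "abundance": "low"},
    {"rain": "no", "abundance": "high"},
    {"rain": "no", "abundance": "low"},
]

cpt = estimate_cpt(records, child="abundance", parents=["rain"])
for parent_state, dist in cpt.items():
    print(parent_state, dist)
```

Each row of the resulting table is a distribution over the child node's states for one combination of parent states, so each row sums to 1.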
R1.10 L 317-333 Content is verbose and can be revised for brevity
Some of the content has been moved to the Discussion section and the content in the lines indicated by the reviewer has been revised for brevity.

R1.11
In (K-means) Clustering, L 338 Although the objective was 'to see whether particular day types are associated with bird strikes', the following section on cluster analysis does not explain how this objective is answered from the knowledge of 'similar days/conditions' groups.
We have added a sentence describing how we relate day types and clusters.
Thus, in our approach, each cluster will represent a day 'type' and the principal components, identified by PCA, will give insights into the data attributes that differentiate between day types.

We have adopted the reviewer's recommendation and reduced data standardisation to a single sentence with references.
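The relationship between clusters and day 'types' can be illustrated with a small sketch: daily condition vectors are standardised and grouped by k-means, and each resulting label is read as a day type. The features (rainfall, maximum temperature), the values, the choice of k = 2 and the deterministic initialisation are all assumptions made for illustration; the study itself used established libraries and its own variables, and the PCA step is omitted here.

```python
import math

def standardise(rows):
    """Z-score each column so that no variable dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

def kmeans(points, k, iters=100):
    """Lloyd's algorithm; the first k points seed the centroids (illustration only)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
               for p in points]
        if new == labels:          # assignments stable: converged
            break
        labels = new
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

# Hypothetical days: [rainfall in mm, maximum temperature in degrees C].
days = [[0.0, 30.0], [25.0, 22.0], [1.0, 31.0],
        [30.0, 21.0], [0.5, 29.0], [28.0, 23.0]]

labels, centroids = kmeans(standardise(days), k=2)
print(labels)
```

Counting strikes and days per label then gives, for each day type, the quantities needed to compare collision likelihood between clusters.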

Results
Much of the Results are actually details of methods or interpretation of results that should be shifted to Methods and Discussion sections, respectively. Results should provide adequate reporting of the output statistics of each analysis (see detailed suggestions below).
We thank the reviewer for this observation. We have carefully reviewed the manuscript and moved material not directly reporting results to either the Methods or Discussion section.
Our response to R1.5.1 details the blocks of text moved from Results to Methods.
The following lines have been moved to the Discussion section:

R1.14 Figures lack legends, axis labels and have been presented without diligence that make them difficult to fully comprehend. There are too many figures, and figures of similar type (Fig 1-3, Fig 4-6, Fig 8-10, Fig 11-12, Fig 13-15) can be grouped with species names as labels / icons. Other recommendations specific to each figure are as follows:
We have left Figures 11-12 as we are concerned that grouping the figures will reduce their size and make them unreadable.

However, the tables/graphs accompanying child nodes seem to be unconditional data distributions that do not allow readers to interpret the nature of the relationships between nodes. Authors can think of presenting the conditional probability tables of child nodes as contingency tables on edges/arrows, or some other concise, interpretable ways, so that readers (including managers) can understand how some environmental changes can promote factors that ultimately lead to higher collision risk.
We thank the reviewer for this suggestion. We have modified images of each of the three trained Bayesian Network models to include: (i) strength of influence indicators, which show the degree to which a parent node influences a child node (strength of influence is given as the width of the arc connecting the nodes), and (ii) subsets of the conditional probability

We have added that the cut-offs were determined through application of GLM of (expected) abundance to strike probability (in line with suggestion in R1.7).

R1.19
L 397-403 should be included in Discussion and removed from here.  Table 4 showing the number of strikes as a function of cluster type is noninformative as we do not know how many days are there in each cluster, to be able to infer if the frequency of strikes was more in one cluster than other. Also cluster difference in collision probability has not been tested statistically, hence the inferences are not reliable. Authors should revise this section (and the corresponding Methods) by including a statistical test of difference in collision probability between day clusters, similar to my suggestion in point 7.
The number of days included in each cluster has been added to Table 4.
We have addressed this suggestion by calculating, for each hazard species, collision likelihood (based on observed strikes) over the entire dataset, and then comparing this with collision likelihood within clusters. The test of significance applied was "overlap" of confidence intervals (at 95% CI). If the confidence interval of a cluster overlapped with the confidence interval of the hazard species, the collision likelihoods were taken to be not significantly different. This has been added to the Methods section and reported in Results.
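The overlap comparison described above can be sketched as follows. The interval method is an assumption: a Wilson score interval for a proportion is used here because the response does not state which interval the manuscript uses, and the strike and day counts are invented. The names `wilson_ci` and `overlaps` are hypothetical.

```python
import math

def wilson_ci(strikes, days, z=1.96):
    """Wilson score interval (approx. 95% for z = 1.96) for strikes/days."""
    p = strikes / days
    denom = 1.0 + z * z / days
    centre = (p + z * z / (2.0 * days)) / denom
    half = z * math.sqrt(p * (1.0 - p) / days + z * z / (4.0 * days * days)) / denom
    return centre - half, centre + half

def overlaps(a, b):
    """True if intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Invented counts for one hazard species: whole dataset and two clusters.
overall = wilson_ci(strikes=60, days=2000)
cluster_a = wilson_ci(strikes=30, days=300)
cluster_b = wilson_ci(strikes=10, days=350)

for name, ci in (("cluster A", cluster_a), ("cluster B", cluster_b)):
    verdict = "not significantly different" if overlaps(overall, ci) else "different"
    print(name, tuple(round(v, 4) for v in ci), verdict)
```

Non-overlap of two 95% intervals is a conservative criterion: intervals can overlap slightly even when a formal two-sample test would report a significant difference.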

R1.26 L 497-511 details of Principal Component Analysis should be shifted from here to Methods. PCA methods should clearly list the variables included and analytical details (data standardization, correlation vs covariance matrix use etc.) in Methods. PCA results should include the variance explained and variable loadings of components in Results.
The details of the PCA have been shifted to Methods as suggested.
The detailed PCA results, including variable loadings, are included in the Supporting Information section.

R1.27
Cluster analysis results should include cluster centroid descriptions in terms of variable means for each hazard species and number of days in each cluster in Results so that readers (including managers) can understand how days are grouped based on conditions and which condition set increases the risk of collision.
The number of observations in each cluster is provided in the Results section (in Table 4). For clarity, they have been included in the caption of each figure.

R1.28
Discussion is weak and largely reiterates the advantages of the approach developed in the MS. It can be strengthened by: a) discussing the findings of applying this approach to the current study in terms of which dynamic factors increased collision risk of a hazard species, and interpreting these results from ecological perspective (see comments in Results section); b) referring to literature on how bird strikes are mitigated across the world beyond simple categorization of hazard species by their risk/severity; and c) how the approach developed in the MS can help advance these current approaches of bird strike mitigation, thereby expanding the scope of the work.
The Discussion section has been revised in line with the reviewer's suggestions. Material from the Results section that was more interpretation of results than pure results has been moved to the Discussion section. Literature relating to mitigation of bird strikes around the world has been included. We have added to the manuscript a positioning of our approach among existing mitigation approaches, pointing out the gap that our approach fills and, thus, how it helps advance the state of the art.

We have tempered this statement to better reflect PCA analysis as indicating attributes contributing to clustering. We observe that the attributes identified through PCA are those used in GLM modelling described in Methods and Results for algebraic modelling.

R1.30
Conclusion reiterates the approaches used, much of which have already been covered. Instead, it can be reduced to a few important concluding statements on the application of these techniques in reducing bird strike problems.
We thank the reviewer for this suggestion. The Conclusion has been rewritten to emphasise the contributions of the techniques to bird strike, and how the models may be employed to reduce risk of collision.

Reviewer 2 Comments
R2.0.1 The paper presents three ways to inform collision risk from wild birds. This is an important study that has real world applications. The authors make a case for their study by referencing other risk assessment frameworks and approaches. However being a study that describes novel methods for risk assessment, a solid, readable and fluid methods section is indispensable to the manuscript, which is currently lacking.
We thank the reviewer for this observation. We have significantly revamped the Methods to make clear our approaches, assumptions, and tools used in conducting the study.
R2.0.2 There is scope for clarifying description of methods. In particular, the descriptions are presently difficult to follow as the important terms used in the calculations are not ordered. One has to refer to previous pages to understand.
In accordance with the reviewer's comments, we have significantly revised the manuscript and moved material out of Results and into Methods section to make each section more self-contained and readable. In particular, the following blocks of text have been moved from Results to Methods:

The conceptual parts of the 3 methods could be included as they apply to the problem being presented, or skipped altogether.
We have addressed this in Methods.

We thank the reviewer for pointing this out. We have used 'count' rather than population throughout the manuscript.
R2.0.8 Moreover non-detection of birds has not been included in the methods, while it has been acknowledged in results (lines 402-403).
We have added a sentence about non-detection and its effect on count, proximity count, and abundance in Methods.
We note that (i) non-detection of individuals from the hazard species (either through not being observed in one, or any zone during the daily count, or not being involved in any harassment activities on a given day) will result in $\mathrm{count}^{z}_{h}$ being 0 for each such zone on the airfield, and will be reflected in the $\mathrm{count}^{p}_{h}$ and $\mathrm{abundance}_{h}$ values for the day, and (ii) overall wildlife abundance on the airfield may be calculated similarly as for a single species.

R2.1
Not clear how variation in bird numbers in Paton's approach is taken into account, and how this approach does not account for seasonal/environmental changes. Authors may elaborate on this.
The issue with Paton's approach is that it provides a means of ranking hazard species according to risk, but does not provide a means of deriving a risk assessment on any given day.

R2.2 WHM?
Wildlife Hazard Management. This is introduced in the second sentence of the Introduction.
R2.3 Study hazard species are those that have a history of collision with aircrafts. However the authors have stated in the introduction that it is important to include species that may not have a history but may still be potential hazards
Two of the species, Nankeen Kestrel and Cattle Egret, have a history of collision. Straw-necked Ibis is a species that is abundant on the airfield, frequently observed in the vicinity of runways and taxiways, but which is rarely involved in strikes. We have tried to clarify this in the Materials and Methods section where we describe the hazard species.

R2.4
For context, it may be useful to define the main food source (for reference in line 225), habitat requirements/nesting ecology of the hazard species.
Food sources and breeding have been added to the description of the hazard species in the Study Area and Data Collection section.
R2.5 Para beginning at line 237: for clarity, I suggest the authors provide some context of the references provided here (eg. Carter in his study on risk assessment and prioritisation of wildlife hazards, the author used ….).
We have amended this paragraph in line with the reviewer's suggestion as: